BigDFT.PostProcessing module

A module for post processing BigDFT calculations.

class BigDFTool(omp=1, mpi_run=None)[source]

This module defines a number of post-processing options, including those that can be driven by utilities like bigdft-tool, memguess, or utilities.

Parameters:
  • omp (int) – number of OpenMP threads. It defaults to the $OMP_NUM_THREADS variable in the environment, if present, otherwise it fixes the run to 1 thread.

  • mpi_run (str) – define the MPI command to be used. It defaults to the value $BIGDFT_MPIRUN of the environment, if present.

auto_fragment(system, view, cutoff, verbose=False, rand=False, criteria='bondorder')[source]

Refragment a system based on the results of a previous calculation.

The auto fragment protocol performs a greedy optimization using either the bond order or the distance between fragments as a guide. The purity values and the fragment bond orders are essential quantities for this calculation, so they can be passed if they are already cached.

By using the rand keyword, you can trigger a stochastic refragmentation process. If this process is repeated many times, you may find a result that improves upon the greedy optimization approach.

Parameters:
  • system (BigDFT.Systems.System) – the system to fragment.

  • view (BigDFT.Systems.FragmentView) – a view of the system.

  • cutoff (float) – the termination criteria. When the worst fragment is more pure than this cutoff, the calculation stops.

  • verbose (bool) – whether to print out as it proceeds.

  • rand (bool) – whether to activate the stochastic approach.

  • criteria (string) – either distance or bondorder.

Returns:

a mapping from the old system to the new system where each fragment fullfills the purity criteria.

Return type:

dict

fragment_small_molecule(sys, view, cutoff=0.05, maxiter=100, verbose=False, seed=0)[source]

Stochastic algorithm for automatically fragmenting small molecules.

Parameters:
  • sys (BigDFT.Systems.System) – the system to fragment.

  • view (BigDFT.Systems.FragmentView) – a view of the system.

  • cutoff (float) – the purity cutoff.

  • maxiter (int) – how many iterations to perform.

  • verbose (bool) – optionally it will print out the progress.

  • seed (int) – a seed value, as this is a stochastic process.

Returns:

a mapping from the original fragments to the new fragments. (BigDFT.Systems.FragmentView): the new fragment view.

Return type:

(dict)

fragment_mask_matrix(sys, mat, fragments, log)[source]

Sometimes we don’t want to store an entire matrix, just the parts related to some fragments of interest. This routine will mask out a matrix, keeping entries only related to the list of fragments provided.

Parameters:
  • sys (BigDFT.Systems.System) – the system associated with the matrix.

  • mat (scipy.sparse.csr_matrix) – the matrix to mask.

  • fragments (list) – a list of fragment ids to keep.

  • log (BigDFT.Logfiles.Logfile) – the logfile associated with this matrix’s calculation.

Returns:

the masked matrix.

Return type:

(scipy.sparse.csr_matrix)

compute_fragment_dos(frag, log, ks_coeff, eigvals, frag_indices=None, smat=None, assume_pure=False, **kwargs)[source]

Compute the partial density of states projected onto a given fragment.

Parameters:
  • sys (BigDFT.Fragments.Fragment) – the fragment to project on to.

  • log (BigDFT.Logfiles.Logfile) – the log of the calculation.

  • ks_coeff (scipy.sparse.csc_matrix) – the matrix of eigenvectors.

  • eigvals (list) – a list of eigenvalues.

  • frag_indices (list) – list of indices associated with this fragment.

  • smat (scipy.sparse.csc_matrix) – the overlap matrix.

  • assume_pure (bool) – an optimization can be performed if we assume the target is pure.

  • kwargs (dict) – any extra options to pass to the DoS constructor.

Returns:

a density of states object built using the partial density of states.

Return type:

(BigDFT.DoS.DoS)

create_layered_qmmm_system(system, target, pairwise_bo, cutoffs, criteria='bondorder', link_atoms=False)[source]

Creates a multilayered system suitable for QM/MM calculations. For each layer, a suitable QM region is built around it.

Parameters:
  • system (BigDFT.Systems.System) – a System class, already broken up into fragments.

  • pairwise_bo (dict) – pairwise bond order values between all fragments in the system.

  • target (str) – the name of the fragment to treat as the target of the qm/mm run.

  • cutoffs (list) – a list of cutoff value for fragment interactions. The number of layers is equal to the number of cutoffs.

  • criteria (str) – how to determine which atoms are included in the QM region. Valid choices are “bondorder” and “distance”.

  • link_atoms (bool) – whether to generate link atoms.

Returns:

a list of Systems, one for each QM layer. (System): the MM region.

Return type:

(list)

create_qmmm_system(system, target, bond_order, cutoff, criteria='bondorder', link_atoms=False)[source]

Creates a system suitable for QM/MM calculations.

Parameters:
  • system (System) – a System class, already broken up into fragments.

  • target (str) – the name of the fragment to treat as the target of the qm/mm run.

  • bond_order (dict) – bond order values between the target fragment and all other fragments in the system.

  • cutoff (float) – a cutoff value for fragment interactions.

  • criteria (str) – how to determine which atoms are included in the QM region. Valid choices are “bondorder” and “distance”.

  • link_atoms (bool) – whether to generate link atoms.

Returns:

the QM region. (System): the MM region.

Return type:

(System)

fragment_bond_order(sys, fraglist1, fraglist2, log, kxs=None, frag_indices=None)[source]

Computes “bond order” between two sets of fragments using the method of Mayer. For two atomic fragments, this would describe the bond multiplicity of the covalent bond.

Parameters:
  • sys (BigDFT.Systems.System) – the system containing the fragments of interest

  • fraglist1 (list) – a list of fragments to compute the bond order of.

  • fraglist2 (list) – a list of fragments to compute the bond order between.

  • log (BigDFT.Logfiles.Logfile) – the log describing a calculation.

  • kxs (scipy.sparse.csc_matrix) – the matrix K*S, which might be already computed to reduce I/O time.

  • frag_indices (dict) – the matrix indices associated with each fragment.

Returns:

a dictionary of dictionaries, mapping the bond order of each fragment in fraglist1 to each fragment in fraglist2.

Return type:

(dict)

fragment_interaction_energy(sys, fraglist1, fraglist2, log, frag_indices=None, sinvh=None, kxs=None)[source]

Compute the interaction energies between two sets of fragments.

Parameters:
  • fraglist1 (list) – a list of fragments to compute the interaction energy of.

  • fraglist2 (list) – a list of fragments to compute the interaction energy between.

  • log (BigDFT.Logfiles.Logfile) – the log describing a calculation.

  • frag_indices (dict) – the matrix indices associated with each fragment.

  • sinvh (scipy.sparse.csc_matrix) – the matrix S^{-1}*H, which might be already computed to reduce I/O time.

  • kxs (scipy.sparse.csc_matrix) – the matrix K*S, which might be already computed to reduce I/O time.

Returns:

the projected energy.

Return type:

(float)

fragment_population(sys, log, frag_indices=None, kxs=None)[source]

Performs Mulliken population analysis on a fragment, in case charges haven’t been computed by doing a multipole analysis.

Parameters:
  • sys (BigDFT.Systems.System) – the system to compute the population of.

  • log (BigDFT.Logfiles.Logfile) – the log describing a calculation.

  • frag_indices (dict) – the matrix indices associated with each fragment.

  • kxs (scipy.sparse.csc_matrix) – the matrix K*S, which might be already computed to reduce I/O time.

Returns:

a mapping from fragment ids to charges.

Return type:

(dict)

This routine adds link atoms to a subsystem based on the bond order of a full system. Link atom positions are automatically adjusted based on the length of some standard bonds.

Parameters:
  • fullsys (BigDFT.Systems.System) – the full system that the subsystem is embedded into.

  • subsys (BigDFT.Systems.System) – the embedded system which needs link atoms.

  • distcut (float) – this cutoff is the largest distance value we expect allow a bond to be.

Returns:

the subsystem with link atoms added. (BigDFT.Systems.System): a system which has the atoms that were

removed and replaced with link atoms.

Return type:

(BigDFT.Systems.System)

run_compute_purity(system, log, kxs=None, frag_list=None, frag_indices=None)[source]

Compute the purity values of the different system fragments.

Note that this can also be computed using the fragment multipoles, but this provides an implementation for when you don’t need those values.

Parameters:
  • system (System) – instance of a System class, which defines the fragmentation.

  • log (Logfile) – logfile from the run computed on this system.

  • kxs (scipy.sparse.csc_matrix) – the matrix K*S, which might be already computed to reduce I/O time.

  • frag_list (list) – we can also only compute the purity values of some select fragments.

  • frag_indices (list) – the indices of the matrix associated with each fragment. This can be precomputed and passed.

Returns:

for each fragment id, what is the purity value.

Return type:

(dict)

get_frag_indices(sys, log)[source]

Compute a lookup table of matrix indices for each fragment in a system.

Parameters:
  • system (System) – instance of a System class, which defines the fragmentation.

  • log (Logfile) – logfile from the run computed on this system.

Returns:

a mapping of fragment ids to lists of indices

Return type:

(dict)

get_ks_coeff(log)[source]

Retrieve the Kohn-Sham coefficients and the eigenvalues in matrix form.

log (Logfile): instance of a Logfile class

Returns:

the matrix of coefficients (list): list of eigenvalues.

Return type:

(scipy.sparse.csc_matrix)

get_matrix_kxs(log)[source]

Computes the matrix K*S, the mulliken version of the density matrix, and loads it into memory.

log (Logfile): instance of a Logfile class

Returns:

the matrix K*S

Return type:

(scipy.sparse.csc_matrix)

derived_quantity(name, log, files, generator)[source]

Defines a quantity that can be derived from the ccs files in the files list. Requires the generator function.

ccs_node(log, matrixname)[source]

Defines the protocol to validate the ccs matrix file

get_matrix_sinvh(log)[source]

Computes the matrix S^{-1}*H, the mulliken version of the spillage matrix, and loads it into memory.

log (Logfile): instance of a Logfile class

Returns:

the matrix S^{-1}*H

Return type:

(scipy.sparse.csc_matrix)

get_matrix_s(log)[source]

Read the overlap matrix into memory.

log (Logfile): instance of a Logfile class

Returns:

the matrix S

Return type:

(scipy.sparse.csc_matrix)

get_matrix_h(log)[source]

Read the hamiltonian matrix into memory.

log (Logfile): instance of a Logfile class

Returns:

the matrix H

Return type:

(scipy.sparse.csc_matrix)

get_matrix_k(log)[source]

Read the density matrix into memory.

log (Logfile): instance of a Logfile class

Returns:

the matrix K

Return type:

(scipy.sparse.csc_matrix)

superunits_purities(bo, pv, q_units, superunits, q_superunits)[source]

Compute the purity values of the superunits described as a unification of the units

Parameters:
  • bo (matrix-like) – Fragment Bond Orders of the Units

  • pv (array-like) – Purity values of the Units

  • superunits (dict) – lookup dictionary containing the list of units per superunit

  • q_units (array-like) – charges of the units

  • q_superunits (array-like) – charges of the superunits

Returns:

purities of the superunits

Return type:

dict

superunits_quadratic_quantities(bo, superunits)[source]

Compute quantities that transforms like the bond orders of the superunits described as a unification of the units

Parameters:
  • bo (matrix-like) – Quantities like Fragment Bond Orders of the Units

  • superunits (dict) – lookup dictionary containing the list of units per superunit

Returns:

quantities of the superunits

Return type:

dict

systems_heatmap(data, restrict_to=None, axs=None, columns=None, **kwargs)[source]

Create a heatmap for a set of systems.

Parameters:
  • data (dict) – a dictionary mapping system names to another dictionary which maps fragment ids to property values.

  • restrict_to (list) – a list of lists saying what fragments we should draw. It is a list of lists instead of just a list because this way we can put a separator between values (for example, when making a jump in the sequence).

  • axs (matplotlib.axis) – the axis to draw on.

  • columns (list) – the order of the columns on the x axis

  • kwargs (dict) – any extra argument you wish to pass to matplotlib’s imshow command.

Returns:

a reference to the imshow value.

dict_distplot(system_dict, ax=None, reuse_ticks=False, kind='violin', **kwargs)[source]

Represent a violinplot from a dictionary of values.

Parameters:
  • system_dict (dict) – dictionary mapping the labels to plot with their values.

  • ax (matplotlib.pyplot.axis) – the axis to plot on.

  • reuse_ticks (bool) – Employ the existing xticks of the axes to plot the data

  • kind (str) – ‘violin’ for violinplot, ‘box’ for boxplot.

  • **kwargs – any other argument to be passed to the violin/boxplot

Returns:

the axis of the plot and the object returned from

the violinplot.

Return type:

tuple

_example()[source]

The following is an example of module usage:

"""
Postprocessing Example
"""
from BigDFT.Systems import System, FragmentView
from BigDFT.Fragments import Fragment
from BigDFT.IO import XYZReader
from BigDFT.Calculators import SystemCalculator
from BigDFT.Inputfiles import Inputfile
from scipy.linalg import eigh
from copy import deepcopy

# Create a system
sys = System()
sys["FRA:0"] = Fragment()
with XYZReader("CH2") as ifile:
    for at in ifile:
        sys["FRA:0"].append(at)
sys["FRA:1"] = Fragment()
with XYZReader("CH3") as ifile:
    for at in ifile:
        sys["FRA:1"].append(at)
sys["FRA:1"].translate([0, 0, -3])
sys["FRA:2"] = deepcopy(sys["FRA:0"])
sys["FRA:2"].translate([0, 0, 3])
sys["FRA:2"].rotate(y=150, units="degrees")
sys["FRA:3"] = deepcopy(sys["FRA:0"])
sys["FRA:3"].translate([3, 0, 1.5])

print(list(sys))

# Run a calculation in the linear mode.
inp = Inputfile()
inp.set_xc("PBE")
inp.set_hgrid(0.55)
inp.set_psp_nlcc()
inp["import"] = "linear"

code = SystemCalculator()
code.update_global_options(verbose=False)
log = code.run(input=inp, posinp=sys.get_posinp(), run_dir="work")
sys.set_atom_multipoles(log)

# Create the post processing tool.
from BigDFT.PostProcessing import BigDFTool
btool = BigDFTool()

# Purity
purity = btool.run_compute_purity(sys, log)
print(purity)

# Charges
charges = {fragid: sum(at.nel for at in frag)
           for fragid, frag in sys.items()}

# Bond Orders
bo = btool.fragment_bond_order(sys, sys.keys(), sys.keys(), log)

# Population values.
population = btool.fragment_population(sys, log)
print(population)

# These three things define a fragment view.
view = FragmentView(purity, bo, charges)

# Auto Fragmentation
mapping = btool.auto_fragment(sys, view, 0.10)
print(mapping)

# This defines a new view.
new_view = view.refragment(mapping)
print(new_view.purities)

# Eigenvalues.
H = btool.get_matrix_h(log)
S = btool.get_matrix_s(log)
w = eigh(H.todense(), b=S.todense(), eigvals_only=True)
print(w)